
    Derivative observations in Gaussian Process models of dynamic systems

    Gaussian processes provide an approach to nonparametric modelling which allows a straightforward combination of function and derivative observations in an empirical model. This is of particular importance in the identification of nonlinear dynamic systems from experimental data. 1) It allows us to combine derivative information, and its associated uncertainty, with normal function observations in the learning and inference process; this derivative information can take the form of priors specified by an expert or of local linearisations identified from perturbation data close to equilibrium. 2) It allows a seamless fusion of multiple local linear models in a consistent manner, inferring consistent models and ensuring that integrability constraints are met. 3) It dramatically improves the computational efficiency of Gaussian process models for dynamic system identification, by summarising large quantities of near-equilibrium data with a handful of linearisations, reducing the training set size, traditionally a problem for Gaussian process models.
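    A minimal 1-D sketch of the first point, combining function observations with derivative (linearisation) observations in a single GP: for an RBF kernel the required cross-covariances are obtained by differentiating the kernel. The kernel choice, toy data, and noise levels below are illustrative assumptions, not taken from the paper.

```python
# Joint GP over function values and derivatives, 1-D RBF kernel.
# All data and hyperparameters below are illustrative assumptions.
import numpy as np

def k_ff(x1, x2, var=1.0, ell=1.0):
    """cov(f(x1), f(x2)) for the RBF kernel."""
    r = x1[:, None] - x2[None, :]
    return var * np.exp(-0.5 * r**2 / ell**2)

def k_fd(x1, x2, var=1.0, ell=1.0):
    """cov(f(x1), f'(x2)) = d k(x1, x2) / d x2."""
    r = x1[:, None] - x2[None, :]
    return k_ff(x1, x2, var, ell) * r / ell**2

def k_dd(x1, x2, var=1.0, ell=1.0):
    """cov(f'(x1), f'(x2)) = d^2 k(x1, x2) / d x1 d x2."""
    r = x1[:, None] - x2[None, :]
    return k_ff(x1, x2, var, ell) * (1.0 / ell**2 - r**2 / ell**4)

# A few function observations plus derivative observations (local slopes
# identified near equilibria), each with its own noise level.
xf, yf = np.array([0.0, 1.0, 2.5]), np.array([0.0, 0.8, 0.6])
xd, yd = np.array([0.5, 2.0]), np.array([1.0, -0.4])
noise_f, noise_d = 1e-2, 1e-2

# Joint covariance over [function observations, derivative observations].
K = np.block([
    [k_ff(xf, xf) + noise_f * np.eye(len(xf)), k_fd(xf, xd)],
    [k_fd(xf, xd).T, k_dd(xd, xd) + noise_d * np.eye(len(xd))],
])

# Posterior mean of f at test inputs, conditioning on both kinds of observation.
y = np.concatenate([yf, yd])
xs = np.linspace(0.0, 3.0, 7)
Ks = np.hstack([k_ff(xs, xf), k_fd(xs, xd)])
print(Ks @ np.linalg.solve(K, y))
```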

    Learning Deep Mixtures of Gaussian Process Experts Using Sum-Product Networks

    While Gaussian processes (GPs) are the method of choice for regression tasks, they also come with practical difficulties, as inference cost scales cubically in time and quadratically in memory. In this paper, we introduce a natural and expressive way to tackle these problems by incorporating GPs in sum-product networks (SPNs), a recently proposed tractable probabilistic model allowing exact and efficient inference. In particular, by using GPs as leaves of an SPN we obtain a novel flexible prior over functions, which implicitly represents an exponentially large mixture of local GPs. Exact and efficient posterior inference in this model can be done through a natural interplay of the inference mechanisms in GPs and SPNs. Thereby, each GP is, similarly to a mixture-of-experts approach, responsible only for a subset of data points, which effectively reduces inference cost in a divide-and-conquer fashion. We show that integrating GPs into the SPN framework leads to a promising probabilistic regression model which (1) is computationally and memory efficient, (2) allows efficient and exact posterior inference, (3) is flexible enough to mix different kernel functions, and (4) naturally accounts for non-stationarities in time series. In a variety of experiments, we show that the SPN-GP model can learn input-dependent parameters and hyper-parameters and is on par with or outperforms traditional GPs as well as state-of-the-art approximations on real-world data.
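    A toy sketch of the idea, under stated assumptions: the root is a sum node over two candidate splits of a 1-D input space, each split is a product node over independent GP leaves fitted to the points in its regions, and the sum-node weights are taken proportional to the product of the leaf marginal likelihoods. Split locations, kernel, and data are all illustrative; the real model nests such splits recursively.

```python
# Toy SPN-GP sketch: a sum node over two product nodes, each a split of the
# input space into two GP leaves. Everything here is an illustrative assumption.
import numpy as np

def rbf(x1, x2, ell=0.5, var=1.0):
    return var * np.exp(-0.5 * (x1[:, None] - x2[None, :])**2 / ell**2)

def gp_leaf(x, y, xs, noise=1e-2):
    """Return (log marginal likelihood, predictive mean at xs) for one GP leaf."""
    K = rbf(x, x) + noise * np.eye(len(x))
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    logml = -0.5 * y @ alpha - np.log(np.diag(L)).sum() - 0.5 * len(x) * np.log(2 * np.pi)
    return logml, rbf(xs, x) @ alpha

rng = np.random.default_rng(0)
x = np.sort(rng.uniform(0.0, 4.0, 40))
y = np.sin(3.0 * x) + 0.1 * rng.standard_normal(40)
xs = np.linspace(0.0, 4.0, 9)

splits = [1.5, 2.5]                      # two candidate product nodes (splits)
log_ws, means = [], []
for s in splits:
    left = x < s
    lm_l, mu_l = gp_leaf(x[left], y[left], xs)
    lm_r, mu_r = gp_leaf(x[~left], y[~left], xs)
    log_ws.append(lm_l + lm_r)           # product node: leaves are independent
    means.append(np.where(xs < s, mu_l, mu_r))  # each leaf predicts in its region

log_ws = np.array(log_ws)
w = np.exp(log_ws - log_ws.max())
w /= w.sum()                             # sum-node weights over the two splits
print((w[:, None] * np.array(means)).sum(axis=0))  # mixture predictive mean
```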

    Distributed Variational Inference in Sparse Gaussian Process Regression and Latent Variable Models

    Gaussian processes (GPs) are a powerful tool for probabilistic inference over functions. They have been applied to both regression and non-linear dimensionality reduction, and offer desirable properties such as uncertainty estimates, robustness to over-fitting, and principled ways for tuning hyper-parameters. However, the scalability of these models to big datasets remains an active topic of research. We introduce a novel re-parametrisation of variational inference for sparse GP regression and latent variable models that allows for an efficient distributed algorithm. This is done by exploiting the decoupling of the data given the inducing points to re-formulate the evidence lower bound in a Map-Reduce setting. We show that the inference scales well with data and computational resources, while preserving a balanced distribution of the load among the nodes. We further demonstrate the utility in scaling Gaussian processes to big data. We show that GP performance improves with increasing amounts of data in regression (on flight data with 2 million records) and latent variable modelling (on MNIST). The results show that GPs perform better than many common models often used for big data.
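    A minimal sketch of the Map-Reduce structure this decoupling enables, assuming fixed inducing inputs and a Gaussian likelihood: each node computes the per-shard sufficient statistics the bound depends on, and these are reduced by element-wise summation before the global computation. The kernel, shard layout, and toy data are illustrative assumptions, not the paper's implementation.

```python
# Distributed sparse GP sketch: "map" computes per-shard statistics for fixed
# inducing inputs Z; "reduce" sums them; the global step uses only the sums.
import numpy as np

def rbf(a, b, ell=1.0, var=1.0):
    return var * np.exp(-0.5 * (a[:, None] - b[None, :])**2 / ell**2)

def map_stats(x_shard, y_shard, Z):
    """Per-node ('map') sufficient statistics that the collapsed bound depends on."""
    Kuf = rbf(Z, x_shard)
    return {"KufKfu": Kuf @ Kuf.T,        # sum_i k_u(x_i) k_u(x_i)^T
            "Kufy":   Kuf @ y_shard,      # sum_i k_u(x_i) y_i
            "yy":     y_shard @ y_shard,  # sum_i y_i^2
            "trKff":  len(x_shard) * 1.0} # sum_i k(x_i, x_i) for a unit-variance kernel

rng = np.random.default_rng(1)
x = np.sort(rng.uniform(0.0, 5.0, 200))
y = np.sin(x) + 0.1 * rng.standard_normal(200)
Z = np.linspace(0.0, 5.0, 10)            # fixed inducing inputs
noise = 0.01

# "Map": statistics per shard; "Reduce": element-wise sums across shards.
shards = np.array_split(np.arange(len(x)), 4)
stats = [map_stats(x[s], y[s], Z) for s in shards]
tot = {k: sum(st[k] for st in stats) for k in stats[0]}

# Global step: Titsias-style predictive mean computed from the reduced statistics.
Kuu = rbf(Z, Z) + 1e-8 * np.eye(len(Z))
A = Kuu + tot["KufKfu"] / noise
xs = np.linspace(0.0, 5.0, 6)
print(rbf(xs, Z) @ np.linalg.solve(A, tot["Kufy"]) / noise)
```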

    Rates of Convergence for Sparse Variational Gaussian Process Regression

    Excellent variational approximations to Gaussian process posteriors have been developed which avoid the O(N³) scaling with dataset size N. They reduce the computational cost to O(NM²), with M ≪ N being the number of inducing variables, which summarise the process. While the computational cost seems to be linear in N, the true complexity of the algorithm depends on how M must increase to ensure a certain quality of approximation. We address this by characterising the behaviour of an upper bound on the KL divergence to the posterior. We show that with high probability the KL divergence can be made arbitrarily small by growing M more slowly than N. A particular case of interest is that for regression with normally distributed inputs in D dimensions with the popular Squared Exponential kernel, M = O(log^D N) is sufficient. Our results show that as datasets grow, Gaussian process posteriors can truly be approximated cheaply, and provide a concrete rule for how to increase M in continual learning scenarios.
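    A small sketch of what this rule means in practice; only the M ~ log^D N growth comes from the result, and the constant factor below is an arbitrary illustrative choice.

```python
# Illustrative rule of thumb: for D-dimensional Gaussian inputs with a squared-
# exponential kernel, grow the number of inducing points as M ~ O(log^D N).
# The constant c is an assumption for demonstration, not from the paper.
import math

def inducing_points_needed(N, D, c=10.0):
    """Suggested M for dataset size N in D dimensions (illustrative constant c)."""
    return max(1, math.ceil(c * math.log(N) ** D))

for N in (10**3, 10**5, 10**7):
    print(N, inducing_points_needed(N, D=2))
# M grows polylogarithmically, so the O(N M^2) cost stays close to linear in N.
```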

    Sparse Gaussian Process Hyperparameters: Optimize or Integrate?

    The kernel function and its hyperparameters are the central model selection choice in a Gaussian process (Rasmussen and Williams, 2006). Typically, the hyperparameters of the kernel are chosen by maximising the marginal likelihood, an approach known as Type-II maximum likelihood (ML-II). However, ML-II does not account for hyperparameter uncertainty, and it is well known that this can lead to severely biased estimates and an underestimation of predictive uncertainty. While there are several works which employ a fully Bayesian characterisation of GPs, relatively few propose such approaches for the sparse GP paradigm. In this work we propose an algorithm for sparse Gaussian process regression which leverages MCMC to sample from the hyperparameter posterior within the variational inducing point framework of Titsias (2009). This work is closely related to Hensman et al. (2015b) but side-steps the need to sample the inducing points, thereby significantly improving sampling efficiency in the Gaussian likelihood case. We compare this scheme against natural baselines in the literature, as well as stochastic variational GPs (SVGPs), and provide an extensive computational analysis.
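    A minimal sketch of such a scheme under stated assumptions: random-walk Metropolis over the log hyperparameters, with fixed inducing inputs and the collapsed Titsias bound standing in for the log marginal likelihood. The flat prior, proposal scale, kernel, and toy data are illustrative choices, not the paper's exact algorithm.

```python
# Metropolis sampling of kernel hyperparameters for a sparse GP, with the
# inducing inputs held fixed. Priors, proposals and data are illustrative.
import numpy as np

def rbf(a, b, ell, var):
    return var * np.exp(-0.5 * (a[:, None] - b[None, :])**2 / ell**2)

def collapsed_bound(x, y, Z, ell, var, noise):
    """Titsias' collapsed lower bound on log p(y | hyperparameters) for fixed Z."""
    N, M = len(x), len(Z)
    Kuu = rbf(Z, Z, ell, var) + 1e-8 * np.eye(M)
    Kuf = rbf(Z, x, ell, var)
    A = Kuu + Kuf @ Kuf.T / noise
    La, Lu = np.linalg.cholesky(A), np.linalg.cholesky(Kuu)
    b = np.linalg.solve(La, Kuf @ y) / noise
    logdet = 2 * (np.log(np.diag(La)).sum() - np.log(np.diag(Lu)).sum()) + N * np.log(noise)
    quad = (y @ y) / noise - b @ b
    trace = (N * var - np.trace(np.linalg.solve(Kuu, Kuf @ Kuf.T))) / noise
    return -0.5 * (N * np.log(2 * np.pi) + logdet + quad + trace)

rng = np.random.default_rng(2)
x = np.sort(rng.uniform(0.0, 5.0, 100))
y = np.sin(x) + 0.1 * rng.standard_normal(100)
Z = np.linspace(0.0, 5.0, 12)            # fixed inducing inputs

theta = np.log([1.0, 1.0, 0.1])          # log [lengthscale, signal var, noise var]
logp = collapsed_bound(x, y, Z, *np.exp(theta))
samples = []
for _ in range(2000):                    # random-walk Metropolis
    prop = theta + 0.05 * rng.standard_normal(3)
    logp_prop = collapsed_bound(x, y, Z, *np.exp(prop))
    if np.log(rng.uniform()) < logp_prop - logp:   # flat prior on log-hyperparameters
        theta, logp = prop, logp_prop
    samples.append(np.exp(theta))
print(np.mean(samples[500:], axis=0))    # posterior-mean hyperparameters after burn-in
```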

    Egalitarian justice and expected value

    According to all-luck egalitarianism, the differential distributive effects of both brute luck, which defines the outcome of risks that are not deliberately taken, and option luck, which defines the outcome of deliberate gambles, are unjust. Exactly how to correct the effects of option luck is, however, a complex issue. This article argues that (a) option luck should be neutralized not just by correcting luck among gamblers, but among the community as a whole, because it would be unfair for gamblers as a group to be disadvantaged relative to non-gamblers by bad option luck; (b) individuals should receive the warranted expected results of their gambles, except insofar as they blamelessly lacked the ability to ascertain which expectations were warranted; and (c) where societal resources are insufficient to deliver expected results to gamblers, gamblers should receive a lesser distributive share in proportion to the expected results. Where all-luck egalitarianism is understood in this way, it allows risk-takers to impose externalities on non-risk-takers, which seems counterintuitive. This may, however, be an advantage, as it provides a luck egalitarian rationale for assisting 'negligent victims'.